[00:01.690 --> 00:07.430]  Hello, and welcome to Cracking Beyond 15 Characters for under $500.
[00:07.750 --> 00:11.590]  Some alternative titles for this talk were Greener Pastures Over the Computational Wall
[00:11.590 --> 00:16.110]  or Why XKCD Advice is Easy to Understand and Hard to Do.
[00:16.210 --> 00:20.570]  That last one was suggested by my management, and we'll talk more about that one later.
[00:21.510 --> 00:25.330]  Hi, I'm Travis Palmer. I'm also a nerd, so I go by the pseudonym tramco.
[00:25.330 --> 00:29.390]  I am an OSCP, OSCE, and GMOD, if those letters mean something to you.
[00:29.390 --> 00:34.590]  And as is traditional for speakers, a couple of my favorite hobbies outside of security are on the right.
[00:34.810 --> 00:38.430]  D&D, Arma 3, and BBC. Yes, again, I am a nerd.
[00:38.670 --> 00:42.790]  And these slides arguably have too much on them, so this is the point where I ditch my ugly mug
[00:43.530 --> 00:47.890]  so that we can happily reveal that picture from the Red Team Village last year.
[00:48.290 --> 00:52.170]  Yeah, let's just say that room was pretty tight.
[00:52.170 --> 00:54.530]  I'm glad we're not trying to do that again this year.
[00:55.150 --> 00:58.350]  Everybody would have COVID by now if we were trying to be doing this live.
[00:59.530 --> 01:05.450]  For this talk, we're going to start with a fair bit of background before we get into the meat of things.
[01:05.450 --> 01:09.410]  I should give a disclaimer. This isn't going to be an all-inclusive introductory talk.
[01:09.410 --> 01:11.630]  There is just too much to get through.
[01:11.790 --> 01:15.670]  And if you've never heard of Hashcat before and are thinking, where can I adopt one?
[01:15.690 --> 01:19.030]  This talk is going to be a little rough to follow. Just stick around.
[01:19.030 --> 01:22.750]  You'll still learn something, and you'll definitely have some keywords to Google later.
[01:22.750 --> 01:26.130]  So, let's address what should always be the first question.
[01:26.450 --> 01:27.210]  Why?
[01:27.510 --> 01:34.510]  Well, large web corporations are still using character requirements that were determined in 1985 by the Department of Defense to be resistant
[01:34.510 --> 01:35.830]  only resistant
[01:35.830 --> 01:39.810]  to online brute force attacks over a 1,200-baud modem.
[01:39.810 --> 01:43.670]  And even then, they should have been rotated every couple months.
[01:44.190 --> 01:49.370]  They also have complexity requirements that should seem familiar because they have become pervasive
[01:49.370 --> 01:55.710]  and are largely responsible for a lot of people honestly believing that they can take almost any English word
[01:55.710 --> 01:59.250]  and add a number and an exclamation point to it.
[01:59.350 --> 02:01.450]  And it makes it an acceptable password.
[02:01.450 --> 02:06.150]  Spoiler, these policies in isolation have not magically become more secure since 1985
[02:06.150 --> 02:09.390]  and are still guarding some very sensitive things.
[02:09.850 --> 02:13.910]  Thankfully, most of these websites and companies have some additional safety nets,
[02:13.910 --> 02:16.970]  like detecting questionable login origins or 2FA.
[02:17.510 --> 02:20.730]  Well, except Wikipedia, which, well, doesn't.
[02:20.790 --> 02:24.910]  And historically might have had one of the worst password policies on the internet.
[02:24.910 --> 02:28.110]  Thank you, Troy Hunt, for using a megaphone to broadcast the history here, which...
[02:29.130 --> 02:31.090]  I just... why?
[02:31.470 --> 02:35.450]  Your fix for zero-character passwords was a one-character requirement?
[02:35.810 --> 02:39.730]  There's also a large number of places where that eight-character standard, well, isn't,
[02:39.730 --> 02:43.170]  which frankly is a little baffling given some of the sensitivity here.
[02:43.170 --> 02:49.170]  PCI, the payment card industry, which only covers the systems where the payment card info is stored,
[02:49.170 --> 02:51.130]  this just seems lackluster.
[02:51.310 --> 02:54.770]  And on the other side of things, imagine what kind of information someone could collect
[02:54.770 --> 02:56.790]  from getting into your Facebook or Pornhub account.
[02:56.790 --> 02:59.430]  Imagine what they could do to your reputation by controlling it.
[02:59.750 --> 03:03.310]  And then there's eBay, where I shouldn't need to explain the level of financial ruin
[03:03.310 --> 03:06.150]  an auction site can bring if somebody else is bidding for you.
[03:06.210 --> 03:10.550]  Wells Fargo, a deserving punching bag perhaps, but they just haven't wised up.
[03:10.550 --> 03:12.370]  That's a 14-character limit.
[03:12.990 --> 03:18.370]  And is there seriously a database with plain-text credentials limited to 14 characters in the background somewhere?
[03:19.770 --> 03:21.010]  Then there's Netflix.
[03:21.870 --> 03:26.350]  Does it make sense why hijacked Netflix accounts are so commonplace?
[03:26.350 --> 03:30.410]  I understand the want to make four-number pins usable, but really?
[03:30.750 --> 03:35.490]  Finally, we have a whole suite of devices in tech, and I'll pick on Cisco because I love them so much,
[03:35.490 --> 03:41.930]  that historically, by default, have no password, and the policy might get set by the people installing it,
[03:41.930 --> 03:45.130]  but, you know, eight characters is suggested.
[03:45.310 --> 03:48.550]  This seems like a deeply questionable recommendation when, again,
[03:49.070 --> 03:53.610]  those passwords were determined to only be resistant to attacks on networks in 1985.
[03:54.090 --> 03:56.990]  And while I'm on the trend of bashing trustworthy sources,
[03:56.990 --> 04:00.470]  it's only proper that I talk about NIST, or the National Institute of Standards and Technology,
[04:00.470 --> 04:03.750]  and what they say, given that what they say often seeps into government regulation.
[04:03.750 --> 04:08.410]  The special publication 800-63B, released in 2017,
[04:08.410 --> 04:12.110]  said that the requirement should actually be 8-plus characters with no complexity,
[04:12.110 --> 04:17.170]  because someone did the statistical analysis and found out that people actually produce easier-to-brute-force passwords
[04:17.170 --> 04:19.030]  under complexity requirements.
[04:19.030 --> 04:22.070]  Like, adding a number and an exclamation point.
[04:22.170 --> 04:26.530]  They also say the maximum should at least be 64 characters.
[04:26.690 --> 04:29.310]  Hmm. A hint at where things might be going.
[04:29.310 --> 04:31.370]  There's also a key section here that says
[04:31.370 --> 04:35.910]  increased password length is a key security control and to encourage passphrases.
[04:36.270 --> 04:37.490]  Hmm. You hear that?
[04:37.490 --> 04:39.670]  The future is coming. Slowly.
[04:39.670 --> 04:42.150]  And the 80s might finally end soon.
[04:42.550 --> 04:45.490]  The most recent guidance, in February this year, mind you,
[04:45.490 --> 04:47.890]  this was not the first time it was published, from NIST and the FBI,
[04:47.890 --> 04:50.450]  is, well, a lot more explicit.
[04:50.450 --> 04:54.350]  15 characters without other complexity requirements is where we should be going.
[04:54.350 --> 04:56.830]  In fact, we should require it as soon as possible.
[04:57.470 --> 05:00.770]  And this is in line with what a bunch of other experts and sources
[05:00.770 --> 05:02.570]  that aren't the largest tech companies
[05:02.570 --> 05:04.930]  have been saying for much, much longer.
[05:05.290 --> 05:08.050]  In fact, a security consulting group that I greatly respect,
[05:08.050 --> 05:09.750]  Black Hills Information Security,
[05:09.750 --> 05:11.430]  makes this suggestion to their clients
[05:11.430 --> 05:15.070]  and uses a more paranoid standard of 20 characters internally.
[05:15.270 --> 05:15.970]  Neat.
[05:16.170 --> 05:17.490]  Though you might be asking,
[05:17.490 --> 05:19.550]  okay, why is Black Hills specifically on here?
[05:19.550 --> 05:20.690]  Well, it's because I'm picking on them
[05:20.690 --> 05:24.310]  and we're about to listen to some snippets of audio from a webcast last December.
[05:24.310 --> 05:26.350]  Mind you, this is a little cut up and out of context,
[05:26.350 --> 05:28.690]  but don't worry, most of the context is still there.
[05:29.270 --> 05:32.910]  7 characters is just easy to crack.
[05:32.910 --> 05:33.910]  How easy?
[05:34.650 --> 05:36.170]  If you go to the next slide,
[05:36.170 --> 05:38.950]  my son's computer, he's got a gaming computer.
[05:38.950 --> 05:41.690]  It takes 8 minutes to crack an LM hash.
[05:42.130 --> 05:44.290]  If you take the same 14 character password,
[05:44.290 --> 05:46.790]  it would take 4.3 billion years.
[05:47.050 --> 05:48.430]  Well, we got time.
[05:48.710 --> 05:51.910]  This is kind of what I'm talking about with regards to
[05:51.910 --> 05:55.230]  the password policy that they set in 1985.
[05:55.230 --> 06:01.650]  I mean, in 1985, an 8 character password policy was secure forever.
[06:03.230 --> 06:04.410]  90 days.
[06:04.410 --> 06:05.870]  90 days, okay.
[06:05.870 --> 06:09.230]  So it was secure for long enough for them to...
[06:10.910 --> 06:12.990]  we don't need to do anything.
[06:13.730 --> 06:17.270]  But, I mean, as technology changes,
[06:17.270 --> 06:19.610]  so do the password policies need to.
[06:19.650 --> 06:21.850]  Hey, this is warfare, right?
[06:21.850 --> 06:24.090]  Everything is move, counter, move.
[06:24.090 --> 06:26.210]  We have electronic countermeasures.
[06:26.210 --> 06:28.530]  You have electronic counter-countermeasures.
[06:28.710 --> 06:31.610]  And then it goes up above that and I lose track.
[06:31.610 --> 06:35.150]  So we're getting questions about
[06:35.150 --> 06:39.530]  have you run the Google common word attack against passphrases?
[06:39.530 --> 06:43.210]  So, like, if you just load words into a cracker,
[06:43.210 --> 06:46.030]  then doesn't that crack passphrases?
[06:46.030 --> 06:48.130]  Sure. Yes, it does.
[06:48.290 --> 06:52.310]  How many combinations of four-word passwords are there?
[06:54.410 --> 06:56.490]  So how long is that going to take you?
[06:56.490 --> 06:59.490]  What if you start using words from a foreign dictionary?
[06:59.810 --> 07:02.510]  What if you start putting salt into those words?
[07:02.510 --> 07:05.130]  Like, but I just, I think if four words,
[07:05.130 --> 07:08.330]  and if you start using special characters as a spacer between your words
[07:08.330 --> 07:10.390]  and things like this,
[07:10.390 --> 07:14.370]  the question is, are you more secure than you were with eight?
[07:15.610 --> 07:18.250]  But, yes, there are attacks against everything.
[07:21.110 --> 07:23.390]  So, Darren kind of summarized this up,
[07:23.390 --> 07:25.430]  this from-the-field kind of our results,
[07:25.430 --> 07:26.950]  but I just did want to talk about it.
[07:26.950 --> 07:30.070]  We have no success with people who have 15-character passwords.
[07:30.170 --> 07:33.130]  Approaches zero. And we have tried these things.
[07:33.130 --> 07:35.250]  Now, we're constrained in our time.
[07:35.250 --> 07:37.190]  Normally, we don't have more than a week or so
[07:37.190 --> 07:38.870]  to do our password guessing,
[07:38.870 --> 07:41.910]  whereas attackers, according to a Verizon report,
[07:41.910 --> 07:44.290]  they've got, like, what, nine months, something like that?
[07:44.890 --> 07:46.530]  Okay, another question, Jason.
[07:46.530 --> 07:49.530]  Yes, so this question's come up a few times.
[07:49.530 --> 07:51.450]  Are spaces legal?
[07:51.670 --> 07:55.790]  Yes, they are a special character, and they are very good.
[07:56.070 --> 07:57.910]  And it depends on the...
[07:58.910 --> 08:00.650]  And I'm going to cut it there.
[08:00.730 --> 08:02.830]  So, they have some pretty solid reasons,
[08:03.150 --> 08:04.770]  a couple of pointed recommendations,
[08:04.770 --> 08:07.350]  but perhaps the most interesting thing mentioned
[08:07.350 --> 08:10.010]  is that they have been given hashes from a domain
[08:10.010 --> 08:11.950]  with a 15-character passphrase policy
[08:11.950 --> 08:13.830]  specifically to crack them.
[08:13.830 --> 08:17.050]  And the amount they can crack in a week approaches zero.
[08:17.050 --> 08:18.630]  Basically, no success.
[08:19.130 --> 08:22.330]  There's also that mention of spaces are a special character.
[08:22.430 --> 08:24.650]  We'll get back around to that one.
[08:25.370 --> 08:27.370]  They are a character, certainly.
[08:27.930 --> 08:30.930]  Any case, approaching zero success.
[08:31.290 --> 08:33.550]  Pack it in. Presentation over.
[08:34.190 --> 08:35.690]  That's all, folks.
[08:37.610 --> 08:39.810]  No? You're not buying that?
[08:39.810 --> 08:41.970]  Well, good, because neither am I,
[08:41.970 --> 08:43.930]  and I should address the final reason I'm here
[08:43.930 --> 08:46.390]  presenting on this topic in particular,
[08:46.390 --> 08:48.290]  which is the CISO of my company.
[08:49.410 --> 08:51.090]  So, getting on the password policy,
[08:51.090 --> 08:53.850]  so, taking that innovative kind of incubator mindset
[08:53.850 --> 08:55.210]  and looking at passwords.
[08:55.310 --> 08:58.730]  So, passwords are something that's highly audited,
[08:58.730 --> 09:00.370]  because it's very easy to do.
[09:00.370 --> 09:01.870]  If you're an auditor and you're trying to assess
[09:01.870 --> 09:03.250]  whether somebody's passing muster,
[09:03.250 --> 09:04.670]  you can have this rule set,
[09:04.670 --> 09:06.830]  and you can walk in and either they pass or they fail.
[09:07.110 --> 09:08.250]  And you'll see a lot of that,
[09:08.250 --> 09:10.110]  and a lot of the companies have this password policy
[09:10.110 --> 09:11.470]  that comes straight out of audits.
[09:11.470 --> 09:14.150]  Eight characters and three or four of uppercase, lowercase,
[09:14.150 --> 09:15.230]  special characters and numbers.
[09:15.230 --> 09:16.110]  There, I just rattled it off.
[09:16.110 --> 09:17.870]  It's been almost 20 years of that.
[09:19.090 --> 09:21.150]  So, we looked at that, though,
[09:21.150 --> 09:23.330]  and as we went through the red teaming,
[09:23.330 --> 09:26.050]  we found that one step in what we call the kill chain,
[09:26.050 --> 09:28.290]  if an attacker were successful,
[09:28.290 --> 09:29.730]  and we actually give them a leg up,
[09:29.730 --> 09:31.010]  we actually bring hackers on
[09:31.010 --> 09:33.210]  and bring them in as fake employees.
[09:33.210 --> 09:35.670]  We give them a laptop and a password and they start.
[09:35.890 --> 09:37.210]  And we found one step in the kill chain
[09:37.210 --> 09:40.070]  was taking all the passwords and cracking them.
[09:40.070 --> 09:43.570]  And they would run computers with graphic processing unit,
[09:44.170 --> 09:46.390]  augmentation, and they'd run for hours,
[09:46.390 --> 09:48.350]  and they'd be able to crack a few passwords
[09:48.350 --> 09:50.090]  and hopefully they'd find a privileged account,
[09:50.090 --> 09:52.010]  and then they were off to the races with that.
[09:52.070 --> 09:53.310]  So, while that was only one step
[09:53.310 --> 09:55.070]  in what I call the kill chain,
[09:55.070 --> 09:57.150]  we just said, you know, we want to win that battle.
[09:57.150 --> 09:58.970]  And here we are compliant,
[09:58.970 --> 10:00.830]  but it's just not getting it done.
[10:00.830 --> 10:02.430]  I'm just meeting compliance.
[10:02.870 --> 10:05.030]  So, we ran some math on the whiteboard
[10:05.030 --> 10:06.030]  and we said, well, what if we went out
[10:06.030 --> 10:08.070]  to a really long password, 15 characters,
[10:08.070 --> 10:10.070]  but got rid of all the complexity requirements?
[10:10.710 --> 10:13.510]  And, you know, I have a lot of people
[10:13.510 --> 10:16.010]  with high math SAT scores on my team.
[10:16.010 --> 10:17.130]  Let's put it that way.
[10:17.510 --> 10:19.070]  So, we had some great whiteboard battles
[10:19.070 --> 10:20.290]  and, you know, how much better would it be
[10:20.290 --> 10:21.250]  and that sort of thing.
[10:22.390 --> 10:24.030]  But what kept creeping in there
[10:24.030 --> 10:26.110]  because of kind of that tick box mentality
[10:26.110 --> 10:27.770]  was, well, we have to have complexity.
[10:27.830 --> 10:28.690]  Everybody's expecting that.
[10:28.690 --> 10:30.810]  We have to have uppercase and lowercase and on and on.
[10:30.810 --> 10:32.430]  But we said, well, let's just try it out.
[10:32.950 --> 10:35.570]  So, the math held and we felt pretty good about it.
[10:35.570 --> 10:36.950]  And then around that time,
[10:36.950 --> 10:39.750]  the National Institute of Standards and Technology, NIST,
[10:39.750 --> 10:41.310]  released an updated standard.
[10:41.310 --> 10:42.870]  And they said, well, length is king.
[10:42.870 --> 10:44.390]  And if you can get a longer password,
[10:44.390 --> 10:45.470]  you can get rid of some of that stuff
[10:45.470 --> 10:47.750]  as long as you look for commonly used passwords
[10:47.750 --> 10:49.030]  and block them out.
[10:49.310 --> 10:51.230]  So, long story short, we did it.
[10:51.230 --> 10:53.730]  And it took about 90 days to roll it in.
[10:54.090 --> 10:56.990]  And the impact has been substantial.
[10:57.070 --> 10:58.830]  And the only reason we were able to do that
[10:58.830 --> 11:01.210]  is because we innovated and created this thing
[11:01.210 --> 11:02.630]  that we called the Kraken
[11:02.630 --> 11:04.410]  that every single day tries to crack
[11:04.410 --> 11:06.010]  all the passwords in the company.
[11:06.130 --> 11:08.250]  And we see the success rate of that machine
[11:08.250 --> 11:09.930]  just dropping precipitously.
[11:09.930 --> 11:10.790]  We saw it.
[11:11.050 --> 11:12.530]  And I'm going to cut it there.
[11:13.070 --> 11:16.970]  So, yeah, that's out in the public domain.
[11:18.290 --> 11:22.750]  And we need to talk about that math Jerry mentioned.
[11:23.270 --> 11:26.730]  See, the math might hold up,
[11:26.730 --> 11:28.490]  but the equation isn't realistic.
[11:28.490 --> 11:30.530]  There's more than one way of approaching this problem.
[11:30.530 --> 11:32.850]  And I've heard the arguments spun multiple different ways.
[11:32.850 --> 11:34.910]  The first of which is it's inconceivable
[11:34.910 --> 11:37.230]  to crack a 15-character password
[11:37.230 --> 11:39.290]  because brute-forcing through all the possible numbers
[11:39.290 --> 11:42.590]  and lowercase letters is 1.6 sextillion combinations.
[11:42.930 --> 11:44.110]  Another argument I have heard
[11:44.110 --> 11:46.330]  that comes from a better standpoint
[11:46.710 --> 11:49.050]  is assuming the passphrase is made up of words.
[11:49.050 --> 11:51.390]  There are a lot of words around five characters or more,
[11:51.390 --> 11:53.050]  so you can say a 15-character passphrase
[11:53.050 --> 11:54.490]  is probably three words,
[11:54.490 --> 11:56.850]  which is extremely difficult.
[11:56.870 --> 11:59.010]  You'd have to go through more than 50 quadrillion options
[11:59.010 --> 11:59.770]  to be sure.
[11:59.770 --> 12:00.570]  Okay, big number,
[12:00.570 --> 12:02.670]  but certainly smaller than the last one.
[12:03.050 --> 12:05.510]  And the last one I've heard,
[12:05.510 --> 12:07.730]  perhaps the most educated of these strawman arguments,
[12:07.730 --> 12:10.430]  is based on the common requirement of eight characters doubling
[12:10.430 --> 12:13.150]  that gets you just above the 15-character requirement,
[12:13.150 --> 12:14.790]  so maybe we should think about passphrases
[12:14.790 --> 12:16.530]  as just multiple passwords combined.
[12:16.530 --> 12:18.950]  In which case, the lowest margin for cracking
[12:18.950 --> 12:23.010]  all of the combinations of two passphrases
[12:23.010 --> 12:26.170]  is just the combinations of two common passwords combined.
[12:26.170 --> 12:28.070]  Which, if we're using the RockYou dictionary,
[12:28.070 --> 12:30.450]  is still over 205 trillion,
[12:30.450 --> 12:32.530]  which they'll tell you is unreasonable.
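For reference, the three numbers quoted above can be reproduced with a few lines of arithmetic. This is only a sketch under stated assumptions: the 1.6 sextillion figure matches 26 lowercase letters over 15 positions (including digits would make it larger), the 370,000-word vocabulary is an assumed size for a large English word list, chosen because its cube lands just above 50 quadrillion, and 14,344,391 is the commonly cited entry count for the RockYou list.

```python
# Search-space sizes behind the three arguments (inputs are assumptions, see above).
LOWERCASE = 26                 # letters only; use 36 if digits are included
ENGLISH_WORDS = 370_000        # assumed size of a large English word list
ROCKYOU_ENTRIES = 14_344_391   # commonly cited size of the RockYou list

brute_force_15 = LOWERCASE ** 15        # every 15-character lowercase string
three_words = ENGLISH_WORDS ** 3        # every ordered triple of words
two_passwords = ROCKYOU_ENTRIES ** 2    # every ordered pair of RockYou entries

for label, size in [
    ("15 lowercase characters", brute_force_15),
    ("three dictionary words", three_words),
    ("two RockYou passwords", two_passwords),
]:
    print(f"{label:>24}: {size:.3e}")
```

Each argument shrinks the search space by several orders of magnitude compared with the one before it, which is the trend the rest of the talk leans on.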
[12:32.850 --> 12:34.490]  Well, I'm here to tell you that all of these arguments
[12:34.490 --> 12:36.990]  that say that cracking past 15 characters,
[12:36.990 --> 12:39.270]  regardless of reasoning, can be undermined,
[12:39.270 --> 12:42.330]  and they can be undermined with only three factors.
[12:42.470 --> 12:43.890]  What are those factors, you might ask?
[12:43.890 --> 12:45.230]  Are you just teasing us?
[12:45.230 --> 12:47.390]  Is this a timeshare scheme in disguise?
[12:47.510 --> 12:51.150]  No, the big secret weakness is humans.
[12:51.810 --> 12:52.690]  Inconceivable!
[12:53.090 --> 12:53.870]  I know.
[12:55.010 --> 12:57.670]  Specifically, humans are bad at big numbers,
[12:57.670 --> 12:59.550]  humans pack complexity on the ends,
[12:59.550 --> 13:01.670]  the ends of passwords or passphrases,
[13:01.670 --> 13:03.190]  and humans pick similar things.
[13:03.190 --> 13:04.690]  Similar things to other humans,
[13:04.690 --> 13:06.890]  common things are common because they're commonly picked.
[13:07.350 --> 13:08.630]  Before I get too far,
[13:08.630 --> 13:11.630]  we should define some limits on what is reasonable,
[13:11.630 --> 13:15.550]  and I should address the other part of the clickbaity title of this talk.
[13:15.550 --> 13:17.070]  Why the $500 limit?
[13:17.310 --> 13:21.030]  Well, besides it being a nice round number that is easy to pitch,
[13:21.030 --> 13:23.530]  it's also the roundabout cost of an entertainment system,
[13:23.530 --> 13:26.530]  and you can safely assume that every type of threat actor
[13:26.530 --> 13:27.930]  you might need to be worried about,
[13:27.930 --> 13:29.550]  including LulzSec and the script kiddies,
[13:29.550 --> 13:31.170]  has access to this amount of money.
[13:31.170 --> 13:33.690]  It's also an amount that can pack a sizable punch
[13:33.690 --> 13:35.090]  regardless of the path you take,
[13:35.090 --> 13:37.570]  either owning a cracking rig or renting out cloud computing.
[13:37.570 --> 13:40.890]  $500 is more than enough to go off and build a small cracking system
[13:40.890 --> 13:44.030]  out of a GPU from eBay and some outdated desktop parts,
[13:44.030 --> 13:45.910]  which you might have found in a dumpster,
[13:45.910 --> 13:50.750]  which should net you the theoretical max of 53 gigahashes per second against NTLM.
[13:50.870 --> 13:56.710]  As for the cloud, in AWS, $500 will get you 68 hours on a spot instance,
[13:56.710 --> 14:00.630]  which, yes, might get preempted. Probably won't, though.
[14:00.630 --> 14:03.430]  With eight V100 GPUs, which you can do manually,
[14:03.430 --> 14:05.750]  or you can use a management tool like Coalfire's NPK
[14:05.750 --> 14:07.510]  to manage the spot instances for you,
[14:07.510 --> 14:10.010]  for both scale and automation of the attack.
[14:10.190 --> 14:12.790]  If you're an attacker, and what you're planning on doing
[14:12.790 --> 14:14.150]  is a series of short-running attacks,
[14:14.150 --> 14:16.350]  instead of waiting 68 hours to get all the results,
[14:16.350 --> 14:19.690]  why not just spin up 68 instances and get the results in one hour?
[14:20.110 --> 14:22.090]  Sadly, NPK doesn't work with Google Cloud,
[14:22.090 --> 14:23.710]  which actually appears to be cheaper,
[14:23.710 --> 14:27.190]  and if you squeeze out everything you can,
[14:27.190 --> 14:30.930]  you can get 83 hours of time on an instance with eight V100 GPUs.
[14:30.930 --> 14:33.550]  Which, besides being terrifically cheap,
[14:33.550 --> 14:35.850]  brings us to a terrifying theoretical max
[14:35.850 --> 14:40.390]  of 189.3 petahashes of NTLM, we can guess.
[14:40.390 --> 14:42.850]  Not mega, not giga, not tera.
[14:43.110 --> 14:45.990]  Peta. 189 quadrillion.
[14:45.990 --> 14:47.990]  If we compare that to that last strawman argument
[14:47.990 --> 14:50.210]  of the RockYou dictionary on top of itself,
[14:50.210 --> 14:52.990]  sure, theoretical isn't real performance,
[14:52.990 --> 14:56.850]  but the difference is off by a factor of about a thousand times.
[14:57.150 --> 14:59.090]  Almost like humans are bad with big numbers.
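As a rough check of that comparison, the sketch below uses only figures quoted in the talk: 189.3 quadrillion theoretical NTLM guesses from 83 hours on one eight-GPU instance, against roughly 205 trillion RockYou pairs. Nothing here is measured, and the per-second rate is simply what those two figures imply.

```python
# Figures quoted in the talk; nothing here is a benchmark.
BUDGET_GUESSES = 189.3e15   # theoretical NTLM guesses for $500 of cloud time
HOURS = 83                  # rented time on one eight-GPU instance
ROCKYOU_SQUARED = 205e12    # "two common passwords combined"

implied_rate = BUDGET_GUESSES / (HOURS * 3600)   # guesses per second, whole instance
headroom = BUDGET_GUESSES / ROCKYOU_SQUARED      # how many times the budget covers the attack

print(f"implied rate: {implied_rate / 1e9:.0f} GH/s")
print(f"headroom: {headroom:.0f}x")
```

The headroom works out to a little over 900, so even if real performance is a fraction of the theoretical maximum, the "unreasonable" attack still fits in the budget.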
[14:59.370 --> 15:02.730]  Which brings us around to the second strawman argument.
[15:02.990 --> 15:04.630]  I'd like to throw in an alternative equation
[15:04.630 --> 15:06.950]  that is still very conservative for how many guesses it takes
[15:06.950 --> 15:08.970]  to crack a three-word passphrase.
[15:09.170 --> 15:11.670]  All of the words that aren't obsolete to the third power
[15:12.190 --> 15:15.070]  divided by the number of people in the organization
[15:15.070 --> 15:17.670]  or number of hashes an attacker has to crack,
[15:17.670 --> 15:20.370]  all of which is divided by the factor of human laziness squared.
[15:20.370 --> 15:22.930]  How do you quantify human laziness, you might ask?
[15:22.930 --> 15:24.450]  Don't ask me, but trust me,
[15:24.450 --> 15:27.770]  the value of that variable always seems to be greater than one.
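Written out, the equation just described is the vocabulary size cubed, divided by the number of hashes, divided by laziness squared. The sketch below is only an illustration of that shape: the function name is made up for this example, and every input value is hypothetical, since the talk deliberately gives no number for laziness beyond "greater than one".

```python
def passphrase_guesses(vocabulary: int, hashes: int, laziness: float, words: int = 3) -> float:
    """The talk's tongue-in-cheek estimate: vocabulary**words / hashes / laziness**2."""
    if hashes < 1 or laziness <= 0:
        raise ValueError("hashes must be >= 1 and laziness must be positive")
    return vocabulary ** words / hashes / laziness ** 2

# Hypothetical inputs: 370,000 non-obsolete words, 1,000 dumped hashes, laziness of 2.
estimate = passphrase_guesses(370_000, 1_000, 2.0)
print(f"{estimate:.3e} estimated guesses")
```

The point of the shape is that both divisors work in the attacker's favor: more hashes and lazier users each shrink the work needed before something cracks.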
[15:28.050 --> 15:31.050]  This is a somewhat joking equation,
[15:31.050 --> 15:32.830]  but the very conservative hypothesis here
[15:32.830 --> 15:34.770]  is that people making a passphrase
[15:34.770 --> 15:36.490]  will use words from languages they know
[15:36.490 --> 15:39.470]  and can think of under the duress of password creation,
[15:39.470 --> 15:41.690]  which tends to be a much more limited set.
[15:41.690 --> 15:43.330]  In fact, I'm going to say now and support later
[15:43.330 --> 15:44.970]  that the pool to pick from for a lot of people
[15:44.970 --> 15:47.770]  seems to be between 32,000 and 64,000 words.
[15:48.050 --> 15:49.330]  Which brings me to a point
[15:49.330 --> 15:51.890]  where we are going to need to break out a lot of math.
[15:51.950 --> 15:53.610]  Because underminer number one,
[15:53.610 --> 15:55.490]  people are bad at big numbers.
[15:55.690 --> 15:57.630]  Thankfully, people are good at spotting trends
[15:57.630 --> 15:59.130]  after they have all the information.
[15:59.410 --> 16:01.610]  So let me give you something that is easier to parse,
[16:01.610 --> 16:02.870]  or hopefully easier to parse.
[16:02.870 --> 16:05.290]  This is a chart of the actual computational limits
[16:05.290 --> 16:06.810]  of combining words together
[16:07.300 --> 16:09.250]  in an attack on a single 2080 Ti
[16:09.250 --> 16:10.950]  on NTLM hashes,
[16:10.950 --> 16:13.990]  which I picked because it's a consumer GPU and I have access to one.
[16:13.990 --> 16:15.350]  The horizontal axis of this chart
[16:15.350 --> 16:17.690]  is the number of words in a passphrase.
[16:17.690 --> 16:20.230]  The vertical axis is the size of the dictionary,
[16:20.230 --> 16:21.630]  or perhaps a more useful way to think about it,
[16:21.630 --> 16:23.670]  the rarest word in a passphrase that can be cracked.
[16:23.770 --> 16:25.350]  Because a logical attacker is going to use
[16:25.350 --> 16:26.870]  the list of words by frequency.
[16:26.870 --> 16:30.110]  The bigger that list is, the rarer the words are going to be that are in it.
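To make the idea of combining words from a frequency-ordered list concrete, here is a toy sketch. It is not how Hashcat implements the attack, and SHA-256 stands in for the target hash purely so the example runs anywhere; the word list and function names are hypothetical.

```python
import hashlib
from itertools import product
from typing import Iterable, Optional, Sequence

def toy_hash(candidate: str) -> str:
    # Stand-in for the real target hash; SHA-256 is used only for portability.
    return hashlib.sha256(candidate.encode("utf-8")).hexdigest()

def combinator_attack(target: str, words_by_frequency: Sequence[str],
                      n_words: int, separators: Iterable[str] = ("",)) -> Optional[str]:
    """Try every ordered combination of n_words words; return the match or None."""
    for sep in separators:
        for combo in product(words_by_frequency, repeat=n_words):
            candidate = sep.join(combo)
            if toy_hash(candidate) == target:
                return candidate
    return None

# Hypothetical frequency-ordered list, most common word first.
wordlist = ["the", "love", "horse", "battery", "staple", "correct"]
target = toy_hash("correct horse battery")
print(combinator_attack(target, wordlist, 3, separators=("", " ")))
```

Truncating the list is the same as capping the rarest word the attack can reach: with only the first three entries, this passphrase is never tried.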
[16:32.510 --> 16:33.690]  Now, all the numbers
[16:34.270 --> 16:35.050]  in the colored boxes
[16:35.050 --> 16:37.770]  are the amount of minutes required to complete
[16:37.770 --> 16:40.170]  an attack search space.
[16:40.170 --> 16:41.630]  And the coloration tells you how long a time
[16:41.630 --> 16:43.890]  that is relative to minutes, hours, days, weeks,
[16:43.890 --> 16:45.770]  months, or years. Because I'm not going to lie,
[16:45.770 --> 16:48.250]  I don't know off the top of my head how many minutes are in a week.
[16:48.590 --> 16:49.850]  Going down or to the
[16:49.850 --> 16:51.890]  right exponentially increases the amount
[16:51.890 --> 16:52.830]  of work needed.
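Each cell of a chart like this can be approximated as dictionary size raised to the number of words, divided by the guess rate. The sketch below assumes a made-up sustained rate of 25 billion guesses per second for one consumer GPU; the actual rate behind the chart is not stated in the talk, so the labels it prints will not match the slide.

```python
def minutes_to_exhaust(dictionary_size: int, n_words: int, guesses_per_second: float) -> float:
    """Minutes needed to try every ordered n_words combination from the dictionary."""
    return dictionary_size ** n_words / guesses_per_second / 60

def bucket(minutes: float) -> str:
    """Coarse label for a duration, mirroring the chart's color bands."""
    for label, limit in [("minutes", 60), ("hours", 1_440), ("days", 10_080),
                         ("weeks", 43_200), ("months", 525_600)]:
        if minutes < limit:
            return label
    return "years"

RATE = 25e9  # hypothetical sustained guesses per second for one consumer GPU
for size in (10_000, 100_000, 1_000_000):
    row = [bucket(minutes_to_exhaust(size, n, RATE)) for n in (2, 3, 4)]
    print(f"{size:>9,} words: {row}")
```

For the record, the limits in `bucket` put 10,080 minutes in a week. Adding a word multiplies the work by the whole dictionary size, which is why the jump from one column to the next looks like a wall.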
[16:54.330 --> 16:55.810]  And there's a lot of spots
[16:55.810 --> 16:57.530]  here where the
[16:57.530 --> 16:59.690]  jumps in difficulty are
[16:59.690 --> 17:02.150]  sharp. And this is where the idea of a
[17:02.150 --> 17:04.030]  computational wall comes in. A seemingly
[17:04.030 --> 17:06.710]  sudden increase in difficulty that you can't go over.
[17:06.710 --> 17:08.010]  But of course,
[17:08.010 --> 17:10.490]  this is only on the chart for a consumer GPU.
[17:10.490 --> 17:12.550]  What about a V100?
[17:12.550 --> 17:13.810]  Well, it's not actually that much
[17:13.810 --> 17:14.630]  different.
[17:15.610 --> 17:17.610]  There are a lot of reasons why V100 should be
[17:17.610 --> 17:19.750]  much better for cracking generally, but it isn't
[17:19.870 --> 17:21.770]  a variable for this particular chart. Anyway,
[17:21.770 --> 17:23.750]  what if we have more than one GPU? Why not
[17:23.750 --> 17:25.750]  eight? Well, the chart is going to shift
[17:25.750 --> 17:28.170]  down a little bit and open up some options.
[17:28.170 --> 17:29.850]  Those of you that have noticed the size of the
[17:29.850 --> 17:31.830]  numbers we're dealing with on the vertical axis
[17:32.830 --> 17:33.810]  probably already have
[17:33.810 --> 17:35.710]  some nightmares to take home, but I'm going to make sure
[17:35.710 --> 17:37.270]  everyone else has them too.
[17:37.330 --> 17:39.550]  In the two-word column, that is more than
[17:39.550 --> 17:41.590]  16 million words in 20 minutes.
[17:41.590 --> 17:43.810]  Basically, every word actually in use
[17:43.810 --> 17:45.830]  from every language spoken by at least 1%
[17:45.830 --> 17:47.470]  of the world's population fits in here
[17:47.470 --> 17:49.610]  with room to spare for a couple extra
[17:49.610 --> 17:51.170]  million common passwords.
[17:51.650 --> 17:53.530]  In the three-word column, that is
[17:53.530 --> 17:55.670]  every non-obsolete word in the English language
[17:56.130 --> 17:57.870]  and it's testable in a
[17:57.870 --> 17:59.990]  matter of hours. In the four-word
[17:59.990 --> 18:01.870]  column, we have a sizable chunk of
[18:01.870 --> 18:04.110]  commonly used words where a targeted attack, say from
[18:04.110 --> 18:06.010]  scraping the memos and websites of the target
[18:06.010 --> 18:08.070]  makes sense.
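The two-word claim above (a 16-million-word dictionary in 20 minutes on eight GPUs) implies a guess rate, and that rate can be turned around to estimate how large a dictionary fits in the other columns. This is a sketch only: it assumes the rate stays constant as the number of words grows, which real tools may not manage, and the 72-hour budget is a hypothetical value rather than one read off the chart.

```python
import math

# From the talk: a 16-million-word dictionary, two words deep, in 20 minutes on eight GPUs.
TWO_WORD_DICT = 16_000_000
TWO_WORD_SECONDS = 20 * 60
implied_rate = TWO_WORD_DICT ** 2 / TWO_WORD_SECONDS   # guesses per second

def max_dictionary(n_words: int, hours: float, rate: float = implied_rate) -> int:
    """Largest dictionary whose n_words combinations fit in the time budget."""
    return math.floor((rate * hours * 3600) ** (1 / n_words))

for n, hours in [(3, 72), (4, 72)]:   # 72 hours is a hypothetical budget
    print(f"{n} words in {hours} h: about {max_dictionary(n, hours):,} dictionary entries")
```

Under these assumptions a three-word attack covers a dictionary in the hundreds of thousands of words, and a four-word attack covers one in the low tens of thousands, which is in line with the way the columns are described.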
[18:08.070 --> 18:09.590]  Before I get too excited, I'm sure someone
[18:09.590 --> 18:11.590]  is thinking, okay, cool, that's just
[18:11.590 --> 18:12.990]  NTLM. What about a slower
[18:13.590 --> 18:16.190]  actually secure hash or quote-unquote secure hash?
[18:16.190 --> 18:17.550]  Well, here's that same chart for
[18:18.870 --> 18:19.350]  SHA-512 Unix passphrases
[18:19.350 --> 18:21.490]  and you'll notice there are still some viable
[18:21.490 --> 18:23.750]  attacks on two- and three-word passphrases.
[18:23.790 --> 18:25.470]  And here's bcrypt or blowfish
[18:25.470 --> 18:27.590]  configured with the Unix defaults, albeit this is
[18:27.590 --> 18:29.630]  some pretty weak Unix defaults.
[18:29.630 --> 18:31.670]  I think Ubuntu has gone well past
[18:31.670 --> 18:33.550]  this now. In any case,
[18:35.170 --> 18:37.590]  we lose a lot of capability, but even
[18:37.590 --> 18:39.730]  when that difficulty curve is ramped
[18:39.730 --> 18:41.510]  way the heck up,
[18:41.510 --> 18:43.710]  we still have options in the two-word
[18:43.710 --> 18:45.730]  passphrase category. As an attacker,
[18:45.730 --> 18:47.130]  this chart makes me really sad, but
[18:47.130 --> 18:49.630]  the reality of most corporations using
[18:49.630 --> 18:51.650]  Windows somewhere in infrastructure is that there
[18:51.650 --> 18:53.590]  will be somewhere something using a hash
[18:53.590 --> 18:55.870]  that, like NTLM,
[18:55.870 --> 18:57.950]  isn't going away
[18:57.950 --> 18:59.830]  for quite a while to come.
[19:00.190 --> 19:01.770]  And as far as individual users
[19:01.770 --> 19:03.010]  that are concerned with
[19:03.670 --> 19:06.150]  what they need to have as a passphrase,
[19:06.150 --> 19:08.190]  well, they need to make sure what they have
[19:08.190 --> 19:10.090]  isn't going to get cracked when the hashes get dumped,
[19:10.090 --> 19:12.150]  either inside an institution or
[19:12.150 --> 19:14.050]  when a website gets all of its passwords dumped by
[19:14.050 --> 19:16.070]  SQL injection. And let's be real, a lot of major
[19:16.070 --> 19:18.090]  websites and companies keep on getting caught with
[19:18.090 --> 19:20.010]  their pants down, and we only find out they're using
[19:20.130 --> 19:21.670]  a fast hash or plaintext
[19:21.670 --> 19:22.990]  the hard way.
[19:23.730 --> 19:26.150]  Not to mention the difficulty of computing a hash
[19:26.150 --> 19:28.090]  is a linear factor in a world where computing
[19:28.090 --> 19:30.090]  power increases exponentially. So I'm
[19:30.090 --> 19:32.010]  going to go back to the other chart, because it's
[19:32.010 --> 19:33.970]  time to get into the attacker mindset and play
[19:34.090 --> 19:35.630]  a game of bad recommendations.
[19:35.990 --> 19:38.270]  First up is Google, and yes,
[19:38.270 --> 19:40.010]  the advice is old, but the
[19:40.010 --> 19:42.130]  advice, much like their policy,
[19:42.130 --> 19:43.890]  hasn't been updated in more than a decade.
[19:43.890 --> 19:45.830]  So, I love sandwiches.
[19:45.850 --> 19:47.670]  It's not a great example, and that mix of
[19:47.670 --> 19:50.410]  leet-speak and case-shifting doesn't do much for the difficulty.
[19:50.410 --> 19:51.910]  I'll be generous and say that it makes it
[19:51.910 --> 19:53.950]  about 100 times harder. In terms
[19:53.950 --> 19:55.550]  of the rarity, this slide shows
[19:55.550 --> 19:57.810]  where the various words fall on Google's
[19:57.810 --> 19:59.950]  own most searched list,
[19:59.950 --> 20:02.010]  that's the G number, and
[20:02.010 --> 20:03.830]  the Wikipedia frequency list,
[20:03.830 --> 20:05.530]  which would be the W number.
[20:06.770 --> 20:07.990]  The fact of the matter
[20:07.990 --> 20:09.990]  here is, the underlying phrase is crackable
[20:09.990 --> 20:11.950]  in under 3 minutes. And even giving the
[20:11.950 --> 20:13.830]  benefit of the doubt and the difficulty of guessing
[20:13.830 --> 20:15.870]  leet-speak substitutions, that's crackable in
[20:15.870 --> 20:17.350]  244 minutes.
[20:17.650 --> 20:19.850]  I should probably also mention the recommendation
[20:19.850 --> 20:21.930]  put forth in the absolute latest NIST and
[20:21.930 --> 20:24.150]  FBI recommendation, because they suggested
[20:24.150 --> 20:25.970]  in a... they suggested the
[20:25.970 --> 20:27.890]  passphrase, voices protected 2020
[20:27.890 --> 20:30.230]  we are, and then suggested the passphrase
[20:30.810 --> 20:32.550]  that is even better.
[20:33.010 --> 20:34.130]  At least according to them.
[20:34.130 --> 20:36.090]  Which is director-month-learn-truck, because
[20:36.090 --> 20:37.630]  the words are unrelated.
[20:38.270 --> 20:40.150]  Well, those words might seem unrelated, but they
[20:40.150 --> 20:41.930]  all have one thing in common, and that's they're all
[20:41.930 --> 20:44.230]  eye-searingly common. The rarest
[20:44.230 --> 20:47.010]  word in there is truck, and oh boy,
[20:47.010 --> 20:47.870]  that is in the top
[20:47.870 --> 20:50.210]  4,000, regardless of
[20:50.210 --> 20:51.570]  which list you choose.
[20:52.130 --> 20:54.430]  Four common words are not safe from offline cracking,
[20:54.430 --> 20:56.330]  which does bring up another, perhaps more
[20:56.330 --> 20:57.630]  viral recommendation
[20:58.710 --> 20:59.670]  from XKCD
[21:00.450 --> 21:01.410]  936.
[21:02.310 --> 21:03.770]  Now, a lot
[21:03.770 --> 21:05.550]  of people have used this as password
[21:05.550 --> 21:06.570]  advice,
[21:07.870 --> 21:10.090]  including the management at Intercontinental Exchange.
[21:10.090 --> 21:11.990]  And the matter of it is,
[21:11.990 --> 21:13.730]  the math in XKCD's advice
[21:13.730 --> 21:15.690]  as written is actually fine, because it
[21:15.690 --> 21:17.650]  was written to handle an online attack
[21:17.650 --> 21:20.330]  where the rate of guessing is only 1,000 a second.
[21:20.450 --> 21:21.730]  Now, there's
[21:21.910 --> 21:23.690]  a claim here that the average user shouldn't have
[21:23.690 --> 21:25.630]  to worry about attempts to crack a stolen
[21:25.630 --> 21:27.650]  hash, and I'm here to tell you from experience
[21:27.650 --> 21:29.970]  the average user reuses passwords.
[21:30.030 --> 21:31.630]  The average user also works at a company,
[21:31.630 --> 21:33.490]  and if that company is larger than a couple dozen
[21:33.490 --> 21:35.510]  people, they should also worry about stolen or dumped
[21:35.510 --> 21:37.190]  hashes, both from their own infrastructure
[21:37.650 --> 21:39.310]  and the websites their users are
[21:39.310 --> 21:41.610]  using both professionally and personally.
[21:41.970 --> 21:43.570]  Four common words isn't safe from offline
[21:43.570 --> 21:45.170]  attacks. If anything, the XKCD
[21:45.650 --> 21:47.690]  example here is a little stronger than intended
[21:47.690 --> 21:49.370]  because staple isn't actually that common.
[21:49.370 --> 21:51.630]  It's number 11,363
[21:52.590 --> 21:53.750]  on the Wikipedia frequency list
[21:53.750 --> 21:55.670]  and completely off the end of the Google list.
[21:56.110 --> 21:57.610]  Then there's the level of entropy.
[21:57.610 --> 21:59.850]  XKCD says it's a
[21:59.850 --> 22:01.750]  common word. If we say
[22:01.750 --> 22:04.190]  common words are words that are in the
[22:05.690 --> 22:06.390]  first 2048
[22:06.390 --> 22:07.930]  of any given word list,
[22:07.930 --> 22:09.750]  and the attackers, well,
[22:09.750 --> 22:11.750]  they're going to win all the way out to six word phrases.
[22:11.750 --> 22:16.070]  Yes, really six. Like, good luck Diceware users.
[22:16.070 --> 22:18.350]  This is nasty.
[22:18.350 --> 22:19.930]  This is not how this...
[22:19.930 --> 22:21.570]  there is no way out of this computationally
[22:21.570 --> 22:23.630]  with just throwing on
[22:24.270 --> 22:25.470]  common words.
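The arithmetic behind that claim can be sketched in a few lines. This is a minimal sketch, not the speaker's spreadsheet: it assumes every word comes from one 2,048-word "common" list, and it borrows the 26 billion hashes per second figure quoted later in the talk as a single-GPU rate for a fast hash. The hardware and time budget behind the chart on the slide are not reproduced here.

```python
# Assumptions (not from the slide): one shared 2,048-word list, and a
# fixed fast-hash rate of 26 billion guesses per second on one GPU.
COMMON_WORDS = 2048
HASHES_PER_SECOND = 26_000_000_000


def keyspace(words_in_phrase: int, list_size: int = COMMON_WORDS) -> int:
    """Number of candidates for a phrase of n words drawn from one list."""
    return list_size ** words_in_phrase


def seconds_to_exhaust(candidates: int, rate: int = HASHES_PER_SECOND) -> float:
    """Worst-case time to try every candidate at a fixed hash rate."""
    return candidates / rate


if __name__ == "__main__":
    for n in range(2, 7):
        total = keyspace(n)
        print(f"{n} words: {total:.3e} candidates, "
              f"{seconds_to_exhaust(total):.3e} s on one GPU")
```

Each extra common word multiplies the work by only 2,048, which is why adding more words from a small list buys less than people expect. Spreading the job over more GPUs divides the time by the number of cards.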
[22:25.930 --> 22:27.690]  Mind you, there are some caveats.
[22:27.730 --> 22:29.590]  The real world isn't as simple as an Excel
[22:29.590 --> 22:31.550]  sheet, so before we get into real
[22:31.550 --> 22:33.530]  world results and the mechanics of attacks, we
[22:33.530 --> 22:35.630]  should very quickly cover the things that
[22:35.630 --> 22:37.710]  are going to be ever present when attacking these passwords
[22:37.710 --> 22:38.730]  using a GPU.
[22:39.750 --> 22:41.770]  Yeah, this is going to get real technical real quick.
[22:41.770 --> 22:43.530]  First of which is the dictionaries and the size
[22:43.530 --> 22:45.530]  of the dictionaries. There's a lot of bandwidth to go around
[22:45.530 --> 22:47.590]  for operations within a GPU and its memory, but
[22:47.590 --> 22:49.250]  transferring data to the GPU?
[22:49.250 --> 22:51.610]  Well, PCIe Gen 3 x16
[22:51.610 --> 22:53.530]  is only 16 gigabytes per second
[22:53.530 --> 22:55.350]  max theoretical, not
[22:55.350 --> 22:57.810]  real world, and that might seem like a lot,
[22:57.810 --> 22:59.530]  but when we're dealing with 16 billion
[22:59.530 --> 23:02.090]  theoretical bytes per second,
[23:05.430 --> 23:07.370]  modern GPUs can do 26
[23:07.370 --> 23:09.420]  billion real hashes of a passphrase
[23:10.030 --> 23:11.270]  per second,
[23:11.270 --> 23:13.290]  and one of these numbers doesn't fit within the other
[23:13.290 --> 23:15.750]  and also is real, not theoretical.
[23:16.050 --> 23:17.550]  That PCI
[23:17.550 --> 23:19.850]  bandwidth is not going to match up with the hash rates.
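To make the mismatch concrete, here is a rough back-of-the-envelope sketch. The two rates are the ones quoted in the talk. The 16 bytes per candidate (a 15-character passphrase plus a newline) is an assumption chosen for illustration.

```python
PCIE_BYTES_PER_SECOND = 16_000_000_000   # theoretical maximum quoted in the talk
GPU_HASHES_PER_SECOND = 26_000_000_000   # hash rate quoted in the talk
CANDIDATE_BYTES = 16                     # assumption: 15 characters plus newline


def max_candidates_over_bus(bus_bytes_per_s: int, bytes_per_candidate: int) -> int:
    """Upper bound on whole candidates per second the bus can deliver."""
    return bus_bytes_per_s // bytes_per_candidate


fed = max_candidates_over_bus(PCIE_BYTES_PER_SECOND, CANDIDATE_BYTES)
utilisation = fed / GPU_HASHES_PER_SECOND

if __name__ == "__main__":
    print(f"Bus can deliver at most {fed:,} candidates per second")
    print(f"That keeps the GPU busy at most {utilisation:.1%} of the time")
```

Under these assumptions the bus delivers at most one billion candidates per second, under 4% of what the GPU could hash. That is why the attacks that follow try to generate candidates on the GPU instead of shipping them across the bus.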
[23:19.950 --> 23:21.690]  The other limitation to be mindful of in terms
[23:21.690 --> 23:23.690]  of what kind of attacks
[23:23.690 --> 23:25.470]  are possible is the nature of
[23:26.170 --> 23:27.770]  CUDA or the AMD equivalent,
[23:27.770 --> 23:29.290]  I'm not going to get into it, cores in the GPU
[23:29.290 --> 23:32.010]  and what they can do. Compute Unified Device Architecture,
[23:32.010 --> 23:33.730]  which is, well, not a lot actually.
[23:33.730 --> 23:35.450]  They were built to do lots of single
[23:35.450 --> 23:37.530]  instruction multiple data computations on matrices
[23:37.530 --> 23:39.210]  and vectors for graphics, which means
[23:39.210 --> 23:41.570]  lots and lots of units that can do
[23:41.570 --> 23:43.690]  arithmetic, ALUs,
[23:43.690 --> 23:46.410]  and not a lot of space or silicon devoted to controlling them.
[23:46.410 --> 23:48.530]  The nature of single instruction
[23:48.530 --> 23:50.430]  multiple data means groups, or in this diagram
[23:50.430 --> 23:52.490]  rows of ALUs, all need to
[23:52.490 --> 23:54.510]  be doing the same type of thing at the
[23:54.510 --> 23:56.790]  same time, and those instructions and a
[23:56.790 --> 23:59.170]  significant portion of the data needs to fit in a smaller
[23:59.170 --> 24:00.770]  shared cache, and oh right,
[24:00.770 --> 24:02.470]  the instructions better be simple and something
[24:02.470 --> 24:04.750]  it can actually do. If that something isn't
[24:04.750 --> 24:06.950]  exactly implemented in HashCat or John the Ripper,
[24:06.950 --> 24:08.950]  often the best bet is to find
[24:08.950 --> 24:11.150]  a workaround. Also, thank you HashCat
[24:11.150 --> 24:13.150]  and John the Ripper devs for making it easy
[24:13.150 --> 24:14.370]  to use these devices for cracking.
[24:14.370 --> 24:17.050]  Okay, let's talk about techniques and results.
[24:17.050 --> 24:18.210]  First, dictionary attacks,
[24:18.210 --> 24:20.470]  basically checking through a massive list of passwords
[24:20.470 --> 24:22.250]  in a file, with or without the GPU generating
[24:22.250 --> 24:24.710]  some additional candidates to check,
[24:24.710 --> 24:26.650]  though more on rules later.
[24:26.970 --> 24:27.930]  At Intercontinental Exchange,
[24:29.550 --> 24:30.690]  we've been running a
[24:30.690 --> 24:32.750]  dictionary attack during the period that Jerry
[24:32.750 --> 24:34.470]  mentioned in that soundbite,
[24:34.470 --> 24:36.730]  while we rolled out the 15 character policy,
[24:36.730 --> 24:38.590]  and as you can see in this graph,
[24:38.590 --> 24:42.490]  the average length of a cracked password,
[24:42.490 --> 24:42.750]  the policy
[24:42.750 --> 24:44.990]  wasn't just successful, it caused some rapid
[24:44.990 --> 24:47.190]  jumps in length when it started to be enforced.
[24:47.270 --> 24:48.570]  And after implementation,
[24:48.570 --> 24:50.890]  the average length evened out despite some outlying
[24:50.890 --> 24:52.750]  spikes. But this is the less interesting
[24:52.750 --> 24:54.670]  half of the story. This is the graph of the
[24:54.670 --> 24:56.730]  percentage of passwords that were cracked by
[24:56.730 --> 24:58.770]  dictionary attacks, only dictionary attacks.
[24:59.130 --> 25:00.730]  Yes, that
[25:00.730 --> 25:02.550]  number in the upper left is 50%.
[25:02.550 --> 25:04.970]  Dictionary attacks are very good when you have a
[25:04.970 --> 25:07.210]  dictionary of passwords made of the same password
[25:07.210 --> 25:09.130]  policy that you're targeting.
[25:09.130 --> 25:10.730]  And yes, dictionaries
[25:10.730 --> 25:12.670]  also aren't very good after people change their password
[25:12.670 --> 25:14.770]  under a completely different and
[25:14.770 --> 25:16.630]  significantly changed policy
[25:16.630 --> 25:18.850]  and start picking things that aren't going to be in the dictionary
[25:18.850 --> 25:20.770]  anymore. In fact, if we overlay the
[25:20.770 --> 25:22.710]  two graphs, we see multiple downward shifts
[25:22.710 --> 25:24.810]  in the number of cracked passwords, even after the
[25:24.810 --> 25:26.970]  average of cracked passwords is above 15
[25:26.970 --> 25:28.970]  characters, as further enforcement
[25:28.970 --> 25:31.890]  for cracking cracked passwords was rolled out.
[25:31.890 --> 25:32.770]  Until it settled
[25:32.770 --> 25:34.710]  down and around the 1%
[25:34.710 --> 25:36.650]  range. Mind you, it didn't approach
[25:36.650 --> 25:38.550]  zero. It held down near a certain
[25:38.550 --> 25:40.610]  percentage. There is always somebody
[25:40.610 --> 25:43.110]  picking and re-picking a terrible password.
[25:43.150 --> 25:44.530]  And these attacks are trimmed and don't show
[25:44.530 --> 25:46.530]  anything that has happened recently. But I assure you, straight
[25:46.530 --> 25:48.810]  dictionary attacks have not had much success.
[25:49.750 --> 25:50.490]  Though I was
[25:50.490 --> 25:52.290]  hired around this time. So let's talk
[25:52.290 --> 25:54.510]  combinator attacks. Attacks
[25:54.510 --> 25:55.970]  centered around combining
[25:55.970 --> 25:58.550]  different dictionaries or the
[25:58.550 --> 26:00.850]  same dictionary results into itself.
[26:01.170 --> 26:02.490]  Or sorry, the same dictionary
[26:02.490 --> 26:04.450]  onto itself. The attack here is pretty
[26:04.450 --> 26:06.590]  straightforward and we've kind of already talked about it
[26:06.590 --> 26:08.190]  so let's get right into the results.
[26:08.470 --> 26:10.470]  And well, here's a chart of every
[26:10.470 --> 26:12.650]  passphrase password longer than 15 characters
[26:12.650 --> 26:14.290]  ever cracked at Intercontinental Exchange.
[26:14.290 --> 26:16.450]  That was crackable with a small dictionary of
[26:16.450 --> 26:18.450]  words or common passwords organized to
[26:18.450 --> 26:20.610]  reflect the charts from the computational
[26:20.610 --> 26:22.850]  wall part of the presentation a little bit earlier.
[26:23.150 --> 26:24.470]  In order to do this, I had to modify
[26:24.470 --> 26:26.290]  an algorithm for dissecting passwords called
[26:26.290 --> 26:28.470]  zxcvbn to be able to actually properly
[26:28.470 --> 26:30.170]  handle common separators like spaces
[26:30.170 --> 26:32.490]  between passwords and a passphrase because
[26:32.490 --> 26:34.470]  apparently the creators of zxcvbn didn't
[26:34.470 --> 26:37.130]  think that was important to handle. Anyway, I digress.
[26:37.710 --> 26:38.670]  A password slots in here
[26:38.670 --> 26:40.330]  based on how many elements, usually
[26:40.330 --> 26:42.390]  words, but sometimes a common password
[26:42.390 --> 26:43.290]  zxcvbn
[26:44.670 --> 26:46.250]  believes it is made up of
[26:46.250 --> 26:48.430]  and the rarity of the rarest element
[26:48.430 --> 26:50.350]  based on the number of guesses
[26:50.350 --> 26:52.670]  required, which should roughly
[26:52.670 --> 26:54.410]  dictate the size of the source dictionary
[26:54.410 --> 26:56.210]  required. And
[26:56.210 --> 26:58.490]  you can see there's a rather precipitous
[26:58.490 --> 27:00.390]  drop-off somewhere between 32,000 and
[27:00.390 --> 27:02.390]  64,000. Along
[27:02.390 --> 27:03.370]  with some pretty significant
[27:03.950 --> 27:06.810]  preference for three element passphrases,
[27:06.810 --> 27:07.070]  which
[27:08.110 --> 27:10.210]  might have had something to do with certain recommendations
[27:10.210 --> 27:11.670]  packaged with the policy.
[27:13.350 --> 27:14.930]  Now, this is neat and all,
[27:14.930 --> 27:16.430]  but what about the other datasets?
[27:16.430 --> 27:18.010]  Say, not from this company.
[27:18.230 --> 27:20.230]  Well, I considered calling up various
[27:20.230 --> 27:22.250]  companies with 15-character policies and seeing
[27:22.250 --> 27:24.110]  if they were willing to give me their hashes.
[27:26.530 --> 27:27.030]  But it
[27:27.030 --> 27:28.390]  didn't work and it seemed pretty unlikely,
[27:28.390 --> 27:29.970]  so I had to artificially make one.
[27:29.970 --> 27:31.870]  So this is the result of over
[27:31.870 --> 27:34.290]  37 million hashes from the Have I Been Pwned
[27:34.290 --> 27:35.890]  version 2 collection, where every
[27:35.890 --> 27:37.970]  password known to be less than 15 characters long
[27:37.970 --> 27:40.050]  has been stripped out, which is really only possible
[27:40.050 --> 27:42.190]  because over 90% of this list has already been cracked.
[27:42.450 --> 27:43.990]  Both of the attacks here are pretty
[27:43.990 --> 27:46.030]  shallow and short-running, in large part because I
[27:46.030 --> 27:47.850]  only needed to prove a point, and
[27:47.850 --> 27:49.910]  also the hashrate is artificially terrible when
[27:49.910 --> 27:51.810]  cross-checking 37 million
[27:51.810 --> 27:54.110]  hashes for matches. That being said,
[27:54.110 --> 27:55.850]  this is 1.14% of
[27:55.850 --> 27:57.750]  all the passphrases in the dataset
[27:57.750 --> 28:00.070]  in less than 3 minutes.
[28:00.370 --> 28:01.630]  Scared yet?
[28:01.710 --> 28:03.990]  Good! Because this is an example
[28:03.990 --> 28:05.770]  of what happens when you use a lot more candidates
[28:05.770 --> 28:07.730]  and checking combinations of
[28:07.730 --> 28:09.830]  passwords. If you still remember
[28:09.830 --> 28:11.390]  that third strawman argument,
[28:11.390 --> 28:13.710]  this is all of the RockYou word
[28:13.710 --> 28:15.930]  list on top of itself. Sure, it takes
[28:16.130 --> 28:17.830]  a day and 16 hours on some piddly
[28:17.830 --> 28:19.570]  GPU horrifically bogged down by
[28:19.570 --> 28:21.510]  37 million hashes to crack.
[28:21.750 --> 28:23.750]  Normally in cloud infrastructure, this would take
[28:23.750 --> 28:25.690]  16 minutes. For a single machine in
[28:25.690 --> 28:27.410]  AWS or Google Cloud with those
[28:27.410 --> 28:30.110]  8 GPUs, it would cost single-digit dollars.
[28:30.870 --> 28:31.790]  Now that I've got
[28:31.790 --> 28:33.770]  the would-be crackers excited, there are some technical
[28:33.770 --> 28:36.030]  shortfalls with HashCat that also need to be noted.
[28:36.030 --> 28:37.710]  The combination mode only takes 2 files
[28:37.710 --> 28:39.670]  as input. You want to use
[28:39.670 --> 28:41.590]  this so HashCat can load as much as it can on
[28:41.590 --> 28:43.790]  GPU RAM ahead of time, which
[28:43.790 --> 28:45.790]  means if we want to check 3 or 4
[28:45.790 --> 28:47.510]  element passphrases,
[28:47.510 --> 28:49.830]  you need intermediate dictionaries, being mindful
[28:49.830 --> 28:51.430]  of the exponential file size.
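A small sketch of what building an intermediate dictionary involves, and why its size needs watching. The four-word list is a hypothetical stand-in, and the size estimate is a rough one (average word length times depth, plus a newline per line), not a measurement of any real list.

```python
from itertools import product


def combine(left, right, sep=""):
    """Yield every left+sep+right candidate, as a combinator attack would."""
    for a, b in product(left, right):
        yield f"{a}{sep}{b}"


def estimated_bytes(words, depth):
    """Rough size of a depth-word dictionary file, one candidate per line."""
    avg_len = sum(len(w) for w in words) / len(words)
    return int(len(words) ** depth * (avg_len * depth + 1))


words = ["correct", "horse", "battery", "staple"]  # hypothetical tiny list
pairs = list(combine(words, words))                # the two-word intermediate

if __name__ == "__main__":
    print(len(pairs), "two-word candidates")
    for depth in (2, 3, 4):
        print(depth, "words:", estimated_bytes(words, depth), "bytes")
```

Feeding `pairs` back in as one side of a second combination gives three-word candidates, and pairing it with itself gives four. Run the estimate against a list of tens of thousands of words before writing anything to disk.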
[28:51.610 --> 28:53.810]  And it's a similar deal for the hybrid mode,
[28:53.810 --> 28:55.510]  which can only take one word list and
[28:55.510 --> 28:57.690]  one mask file. You could
[28:57.690 --> 28:59.730]  use other programs and pipe in candidates,
[28:59.730 --> 29:02.130]  but you pay the piper when doing so in a fast hash.
[29:02.330 --> 29:03.650]  And the other thing
[29:03.650 --> 29:05.310]  HashCat isn't going to do for you is
[29:05.310 --> 29:07.610]  generate variants of passphrase candidates with
[29:07.610 --> 29:09.730]  all the spaces and capitalization that people generally
[29:09.730 --> 29:11.730]  use. From what I've seen, you're
[29:11.730 --> 29:13.690]  going to want title case, sentence case, and all
[29:13.690 --> 29:15.970]  lowercase, both with and without spaces.
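The six variants listed there can be sketched as a small helper. This is an illustration of the idea only. The function name is made up for this example, and a fast-hash attack would normally produce these variants with rules on the GPU, not in Python.

```python
def passphrase_variants(words):
    """Return title, sentence, and lower case forms, with and without spaces."""
    if not words:
        return []
    lower = [w.lower() for w in words]
    title = [w.capitalize() for w in lower]
    sentence = [lower[0].capitalize()] + lower[1:]

    variants = []
    for cased in (title, sentence, lower):
        for sep in (" ", ""):
            candidate = sep.join(cased)
            if candidate not in variants:   # single words collapse to fewer forms
                variants.append(candidate)
    return variants


if __name__ == "__main__":
    for v in passphrase_variants(["the", "cows", "eat", "grass"]):
        print(v)
```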
[29:16.630 --> 29:17.590]  No, I do not consider
[29:17.590 --> 29:19.450]  spaces a special character anymore.
[29:19.510 --> 29:21.210]  The easiest way to do this is with
[29:21.210 --> 29:23.250]  individual rules on a
[29:23.250 --> 29:25.890]  combinator attack, starting with a preprocessed
[29:25.890 --> 29:28.090]  dictionary that already has the spaces
[29:28.090 --> 29:29.330]  between capitalized words.
[29:29.330 --> 29:31.710]  Then you can manipulate it easily enough to get the spaces
[29:31.710 --> 29:33.510]  and capitalization desired. For HashCat,
[29:33.510 --> 29:35.970]  -j corresponds to the left word list, -k corresponds to the
[29:36.010 --> 29:37.830]  right list. Which, on the topic
[29:37.830 --> 29:39.530]  of rules, is another shortfall.
[29:39.530 --> 29:41.890]  You can only do one set of rules per word list in
[29:41.890 --> 29:43.450]  combination mode. Yes, sorry, if you
[29:43.450 --> 29:45.850]  want to check lots of different things using rules
[29:45.850 --> 29:48.570]  like common words in between two words,
[29:48.570 --> 29:49.090]  you're going to need
[29:49.810 --> 29:51.070]  a honking, massive, and ugly
[29:51.070 --> 29:53.130]  bash file, or a program to manage it for you.
[29:53.130 --> 29:55.050]  This is actually what TrustedSec's hate_crack is
[29:55.050 --> 29:56.930]  doing behind the scenes for what they call middle
[29:56.930 --> 29:58.850]  or thorough combinator attacks. It's basically
[29:58.850 --> 30:01.190]  spawning off tons of
[30:01.190 --> 30:02.810]  HashCat command lines.
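A hedged sketch of what "spawning off tons of command lines" can look like. This is not hate_crack's code. It only builds argument lists and never runs them. It assumes hashcat's combinator mode (`-a 1`), NTLM as mode 1000, and the `-j` option carrying a single rule for the left word, where `$X` appends the character X. The file names and the list of middle strings are hypothetical.

```python
import shlex


def middle_combinator_commands(hash_file, left, right, middles, hash_mode=1000):
    """Build one hashcat combinator command line per middle string."""
    commands = []
    for middle in middles:
        # One "$X" append function per character, applied to the left word.
        rule = "".join(f"${ch}" for ch in middle)
        commands.append([
            "hashcat", "-a", "1", "-m", str(hash_mode),
            hash_file, left, right, "-j", rule,
        ])
    return commands


if __name__ == "__main__":
    cmds = middle_combinator_commands(
        "hashes.txt", "words.txt", "words.txt", [" ", "and", "4"]
    )
    for cmd in cmds:
        print(shlex.join(cmd))   # print only; nothing is executed here
```

Each middle string costs a full pass over both word lists, so the list of middles is best kept short and ordered by how often each one shows up in real passphrases.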
[30:03.870 --> 30:04.570]  Alright.
[30:05.370 --> 30:07.150]  Enough about conditions of combinators,
[30:07.150 --> 30:08.950]  let's move on to Prince attacks, which, despite
[30:08.950 --> 30:10.870]  the other name for the attack, has
[30:10.870 --> 30:13.610]  nothing to do with the artist formerly known as.
[30:13.710 --> 30:15.370]  Prince is an
[30:15.370 --> 30:17.750]  acronym for Probability Infinite Chained Elements,
[30:17.750 --> 30:19.550]  but it is pretty
[30:20.090 --> 30:21.690]  simple in purpose. Take
[30:21.690 --> 30:23.670]  the path of least resistance and find more
[30:23.670 --> 30:25.650]  passphrases sooner by outputting shorter-length
[30:25.650 --> 30:27.670]  candidates first. Not only is the search
[30:27.670 --> 30:29.470]  space smaller, but we can also expect to find more
[30:29.470 --> 30:31.510]  passphrases here because people tend to pick
[30:31.510 --> 30:34.130]  passphrases close to the minimum required effort.
[30:34.350 --> 30:35.810]  This overlap of least resistance
[30:35.810 --> 30:37.870]  motivations is just great for an attacker.
[30:38.430 --> 30:39.710]  How does Prince work? Well, it
[30:39.710 --> 30:41.670]  ingests a word list, sorts all the contents
[30:41.670 --> 30:43.690]  by length, and puts them into separate
[30:43.690 --> 30:45.670]  lists, and then begins producing
[30:45.670 --> 30:47.810]  candidates at the minimum specified length
[30:47.810 --> 30:50.430]  out of the lists of the various lengths,
[30:50.430 --> 30:51.910]  iterating through all possible combinations
[30:51.910 --> 30:54.150]  of lists until the options are exhausted,
[30:54.150 --> 30:55.890]  at which point it moves on to making longer
[30:55.890 --> 30:58.070]  candidates. Now,
[30:58.070 --> 30:59.750]  after that explanation of how Prince works, you might
[30:59.750 --> 31:01.930]  be wondering, why not just
[31:01.930 --> 31:04.550]  use combinators of length-cut lists?
[31:04.570 --> 31:05.750]  Aren't Prince and Hashcat separate
[31:05.750 --> 31:07.710]  programs? Won't I have to pay the piper? Didn't you
[31:07.710 --> 31:09.730]  just say that transferring all that stuff for PCIe was
[31:09.730 --> 31:12.050]  terrible for performance? Well, yes,
[31:12.050 --> 31:13.770]  you do have to pay the piper, and you could
[31:13.770 --> 31:15.910]  manage all of these attacks manually,
[31:15.910 --> 31:17.950]  but it gets pretty crazy to manage
[31:17.950 --> 31:19.830]  all the lists and combinations for 3- and 4-word
[31:19.830 --> 31:21.930]  elements. It's doable, but you really need another
[31:21.930 --> 31:23.630]  program or script, and
[31:23.630 --> 31:25.830]  there's a solution that circumvents this,
[31:25.830 --> 31:28.150]  and it comes back to one of the underlying factors.
[31:28.870 --> 31:29.630]  This is
[31:30.450 --> 31:31.830]  the positions of dictionary
[31:31.830 --> 31:33.850]  elements in passphrases. Larger
[31:33.850 --> 31:35.830]  numbers here mean more passphrases contained
[31:35.830 --> 31:38.190]  an element straight out of a dictionary in that position.
[31:38.190 --> 31:38.750]  And there is
[31:39.750 --> 31:41.930]  one heck of a disparity between the last
[31:41.930 --> 31:44.670]  position in a passphrase and the rest of a passphrase.
[31:44.810 --> 31:45.830]  Because humans pack
[31:45.830 --> 31:48.190]  complexity on the ends. Mostly at the end,
[31:48.190 --> 31:50.150]  sometimes in the beginning, extremely
[31:50.150 --> 31:51.810]  rarely in the middle.
[31:52.550 --> 31:53.970]  So, the fix for Prince's
[31:53.970 --> 31:55.530]  terrible hashrate on fast hashes
[31:55.530 --> 31:58.430]  when it's piped in? Rules.
[31:58.450 --> 32:00.410]  Because you can use the default
[32:00.410 --> 32:02.490]  hashcat cracking mode, you can specify
[32:02.490 --> 32:04.530]  with a rules file.
[32:04.530 --> 32:06.770]  Any problems? Well, yeah, most rule
[32:06.770 --> 32:08.610]  lists are made by people on the internet
[32:08.610 --> 32:10.150]  and aren't long enough to make use of a GPU's
[32:10.150 --> 32:12.170]  spare resources or are more focused on modifying
[32:12.170 --> 32:14.350]  passwords and not altering
[32:14.350 --> 32:16.330]  passphrases, like adding suffixes
[32:16.330 --> 32:18.230]  or prefixes. So, I've made
[32:18.230 --> 32:20.170]  one more focused on common prefixes
[32:20.170 --> 32:21.530]  and suffixes,
[32:22.070 --> 32:24.450]  you know, your 2020, your exclamation point,
[32:24.450 --> 32:25.330]  etc. With
[32:25.330 --> 32:28.910]  4,175 rules,
[32:28.910 --> 32:30.070]  which is up in the
[32:30.070 --> 32:32.330]  rephraser repo I'll mention later,
[32:32.330 --> 32:34.170]  and soon to come I'll also be releasing a prefix
[32:34.170 --> 32:36.250]  and suffix list, which probably is already up by
[32:36.250 --> 32:38.030]  now, based on the frequency data
[32:38.030 --> 32:40.050]  of those cracked password phrases mentioned
[32:40.050 --> 32:42.670]  earlier in the Have I Been Pwned version 2 list.
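A minimal sketch of how a prefix and suffix rule list of that kind can be generated. It is not the contents of the speaker's rule file. The suffixes and prefixes below are hypothetical examples. The only hashcat rule syntax relied on is `:` (do nothing), `$X` (append character X), and `^X` (prepend character X).

```python
def append_rule(suffix):
    """Rule that appends the suffix, one "$X" function per character."""
    return "".join(f"${ch}" for ch in suffix)


def prepend_rule(prefix):
    """Rule that prepends the prefix.

    "^X" pushes one character onto the front, so the characters are
    emitted in reverse to come out in the right order.
    """
    return "".join(f"^{ch}" for ch in reversed(prefix))


suffixes = ["2020", "!", "123"]   # hypothetical examples
prefixes = ["The", "#"]           # hypothetical examples

rules = [":"]                     # pass the candidate through unchanged
rules += [append_rule(s) for s in suffixes]
rules += [prepend_rule(p) for p in prefixes]

if __name__ == "__main__":
    print("\n".join(rules))
```

Saved to a file, a list like this is passed to hashcat's straight mode with `-r`. The GPU then expands every piped-in candidate into one guess per rule, which is how a few thousand rules claw back the hash rate lost to piping.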
[32:43.130 --> 32:43.830]  Okay.
[32:43.830 --> 32:45.110]  Let's talk success rates
[32:45.830 --> 32:47.150]  for Prince.
[32:48.970 --> 32:50.030]  Now, the
[32:50.030 --> 32:51.430]  kicker here is that
[32:52.030 --> 32:53.950]  we're making use of that rule list I mentioned
[32:53.950 --> 32:56.210]  to catch passphrases with common suffixes
[32:56.210 --> 32:57.830]  and prefixes. There are no
[32:57.830 --> 32:59.970]  single-element passphrases here. We can
[32:59.970 --> 33:02.730]  see a repeat of an earlier trend.
[33:02.990 --> 33:04.070]  Namely, a sudden
[33:04.070 --> 33:06.350]  drop-off between 32,000 and 64,000
[33:06.350 --> 33:08.630]  with some clustering around three elements,
[33:08.630 --> 33:10.410]  including the suffix or prefix.
[33:10.990 --> 33:12.390]  You've got to remember all of these are being
[33:12.390 --> 33:13.430]  run with the rules.
[33:14.030 --> 33:16.170]  This works out pretty well for sussing out
[33:16.170 --> 33:18.290]  passphrases, at least within
[33:18.290 --> 33:19.670]  Intercontinental Exchange.
[33:20.570 --> 33:22.390]  What about a much larger dataset?
[33:22.630 --> 33:24.450]  Well, we can also see the reason
[33:24.450 --> 33:26.390]  why people
[33:26.390 --> 33:27.990]  like using Prince attacks.
[33:28.070 --> 33:30.090]  Both of these attacks were using the
[33:30.090 --> 33:32.090]  RockYou password dictionary as the source, and
[33:32.090 --> 33:33.870]  both were stopped at the same time at
[33:33.870 --> 33:36.070]  one-eighth the runtime of the RockYou squared
[33:36.070 --> 33:38.670]  combinator, RockYou on top of itself.
[33:38.710 --> 33:40.010]  That complete attack
[33:40.010 --> 33:41.890]  I mentioned earlier took one day, 16
[33:41.890 --> 33:43.970]  hours on our piddly setup, and got
[33:43.970 --> 33:45.930]  5.82%. These attacks
[33:45.930 --> 33:47.770]  managed to concentrate a significant
[33:47.770 --> 33:50.450]  portion of that success very early in cracking.
[33:50.470 --> 33:51.870]  The other comparison here
[33:51.870 --> 33:53.990]  is the performance between restricting the
[33:53.990 --> 33:55.750]  number of elements Prince is allowed to use
[33:55.750 --> 33:57.750]  in the attack. At least
[33:57.750 --> 34:00.150]  in this case,
[34:00.150 --> 34:02.150]  there was a limited but significant advantage
[34:02.150 --> 34:03.990]  to focusing only on three-word element
[34:03.990 --> 34:06.090]  passphrases. Because again, we're also using
[34:06.090 --> 34:07.990]  that ruleset mentioned earlier with the prefixes
[34:07.990 --> 34:10.150]  and suffixes, based on the Prince candidates.
[34:10.370 --> 34:11.990]  I'll also say again,
[34:11.990 --> 34:14.250]  this was at a really abysmal hash rate.
[34:14.530 --> 34:16.030]  We barely were reaching
[34:16.030 --> 34:18.050]  one gigahash when doing this, and
[34:18.050 --> 34:20.210]  these attacks can be much, much faster
[34:20.210 --> 34:22.190]  when you're not drowning your GPU
[34:22.190 --> 34:24.370]  in 37 million hashes to check.
[34:25.010 --> 34:26.410]  And speaking of fast,
[34:26.410 --> 34:28.250]  it's about time I get to the new and shiny.
[34:28.250 --> 34:30.270]  Word-level Markov chains for going after
[34:30.270 --> 34:31.870]  passwords that are at their core
[34:31.870 --> 34:33.470]  phrases.
[34:33.930 --> 34:36.030]  What are word-level Markov chains?
[34:36.030 --> 34:38.070]  The simplest explanation I can give is
[34:38.070 --> 34:40.050]  it's an over-glorified state model
[34:40.050 --> 34:42.690]  to store relationships between groups of words.
[34:42.690 --> 34:44.170]  For this example,
[34:44.170 --> 34:46.710]  we'll just do a two-gram model,
[34:46.710 --> 34:48.410]  think two elements in the state.
[34:48.410 --> 34:50.330]  If the phrase used for training is
[34:50.330 --> 34:51.850]  the cows eat grass,
[34:52.150 --> 34:54.270]  a two-gram model will record state transitions
[34:54.270 --> 34:56.070]  for "the cows" to "eat"
[34:56.070 --> 34:58.690]  and "cows eat" to "grass".
[34:58.690 --> 35:00.930]  Obviously, this isn't a terribly
[35:00.930 --> 35:02.570]  useful model by itself, but
[35:02.570 --> 35:04.790]  imagine if it had some more training data,
[35:04.790 --> 35:06.870]  say, hundreds, thousands, or millions
[35:06.870 --> 35:08.750]  of sentences. The intent is to build up
[35:08.890 --> 35:11.210]  a model of the relationships between groups of words.
[35:11.210 --> 35:12.790]  And if we want to be more accurate, we can also
[35:12.790 --> 35:14.610]  increase the number of elements in the state,
[35:14.610 --> 35:16.830]  or raise the number-gram of the
[35:16.830 --> 35:18.570]  model to three or beyond.
[35:19.190 --> 35:20.650]  Though research generally
[35:20.650 --> 35:22.490]  seems to show that three is
[35:22.490 --> 35:24.670]  probably the point where you should stop.
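The two-gram example above can be sketched in a few lines. This is a toy illustration of a word-level Markov model, not the speaker's tool. The function names are made up, and a real model would be trained on a large corpus and would need a way to pick starting states.

```python
import random
from collections import Counter, defaultdict


def train(sentences, n=2):
    """Map each n-word state to a count of the words seen right after it."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - n):
            state = tuple(words[i:i + n])
            model[state][words[i + n]] += 1
    return model


def generate(model, start, max_words=6, rng=random):
    """Walk the chain from a starting state, weighting by observed counts."""
    phrase = list(start)
    n = len(start)
    while len(phrase) < max_words:
        followers = model.get(tuple(phrase[-n:]))
        if not followers:
            break                      # dead end: state never seen in training
        choices, weights = zip(*followers.items())
        phrase.append(rng.choices(choices, weights=weights)[0])
    return " ".join(phrase)


model = train(["the cows eat grass"])

if __name__ == "__main__":
    for state, followers in model.items():
        print(state, "->", dict(followers))
    print(generate(model, ("the", "cows")))
```

With one training sentence the chain can only replay that sentence. With millions of sentences, each state has many weighted followers, and the generator produces plausible phrases that never appeared verbatim in the training text.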
[35:25.430 --> 35:26.650]  Now, there is
[35:26.650 --> 35:28.690]  one pre-existing proof of concept
[35:28.690 --> 35:30.390]  I could find that does
[35:30.390 --> 35:31.730]  more or less this
[35:33.130 --> 35:34.670]  and actually would generate
[35:34.670 --> 35:37.170]  output fast enough to be useful in password cracking.
[35:37.170 --> 35:38.670]  Sadly, it was single-threaded,
[35:38.670 --> 35:40.890]  written in a mix of C# and C++,
[35:40.890 --> 35:42.710]  and designed with a Windows compile
[35:42.710 --> 35:44.670]  target in mind. That simply wouldn't do
[35:44.670 --> 35:46.710]  for my purposes, so I created something similar in Python
[35:46.710 --> 35:48.950]  that is parallel, uses existing libraries,
[35:48.950 --> 35:51.010]  and is naturally more multiplatform.
[35:51.030 --> 35:52.490]  So, how well does it do?
[35:52.810 --> 35:54.630]  Well, the astute members of the audience
[35:54.630 --> 35:55.950]  when I showed this table
[35:55.950 --> 35:58.090]  on the right of every
[35:58.830 --> 36:00.090]  passphrase crack that was
[36:00.390 --> 36:02.410]  combinator-findable may have noticed
[36:02.410 --> 36:03.450]  some numbers that were
[36:04.150 --> 36:06.510]  a little too far into the lower right
[36:06.510 --> 36:08.370]  based on the computational complexity charts
[36:08.370 --> 36:09.750]  we went through before.
[36:10.010 --> 36:11.850]  Let me color that in for you.
[36:12.350 --> 36:13.930]  Now, the very astute of you
[36:13.930 --> 36:16.270]  might also infer that I did not
[36:16.270 --> 36:19.450]  in fact have unlimited access to eight V100s,
[36:19.450 --> 36:20.730]  and also
[36:20.730 --> 36:22.170]  that chart on the right is
[36:22.170 --> 36:24.010]  strangely positioned almost as if it
[36:24.010 --> 36:26.150]  extends further to the right.
[36:26.230 --> 36:27.770]  Well, it does.
[36:27.890 --> 36:30.730]  And I had a single 2080 Ti for my research.
[36:30.910 --> 36:32.090]  I was also
[36:32.090 --> 36:34.070]  heavily timeboxing. Everything on here
[36:34.070 --> 36:36.210]  that isn't a shade of green is not from
[36:36.210 --> 36:38.370]  combinator attacks. Flat out.
[36:38.550 --> 36:40.650]  Admittedly, it's not a deluge of results,
[36:40.650 --> 36:42.430]  but, come on.
[36:42.570 --> 36:44.270]  Years. Literal years
[36:44.270 --> 36:46.610]  for the combinator attack would not have yielded some of these.
[36:46.610 --> 36:48.130]  And the couple out in the six
[36:48.130 --> 36:49.890]  and seven element column?
[36:49.890 --> 36:52.110]  Do you have any idea how long 10 to the 13 is?
[36:52.110 --> 36:53.690]  Because that's what they correspond to
[36:53.690 --> 36:55.990]  in the difficulty computation.
[36:55.990 --> 36:57.230]  It's millennia.
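[Editor's sketch] The scale of that figure can be checked with a little arithmetic. The transcript does not state the unit of "10 to the 13"; reading it as seconds is an assumption here. The dictionary size and guess rate in the second example are likewise hypothetical, chosen only to show how a combinator keyspace grows with the number of words.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def seconds_to_years(seconds):
    """Convert a duration in seconds to years."""
    return seconds / SECONDS_PER_YEAR

def years_to_exhaust(dictionary_size, words, guesses_per_second):
    """Years needed to try every combination of `words` dictionary entries."""
    keyspace = dictionary_size ** words
    return seconds_to_years(keyspace / guesses_per_second)

# Assumption: the 10^13 figure is a number of seconds.
print(f"10^13 seconds is about {seconds_to_years(10**13):,.0f} years")

# Hypothetical: 10,000-word dictionary, 6 words, 1e9 guesses per second.
print(f"{years_to_exhaust(10_000, 6, 1e9):,.0f} years to exhaust")
```

Under that reading, the figure is hundreds of thousands of years, consistent with the "millennia" remark.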
[36:57.610 --> 37:00.490]  Certainly I'm very happy, but what about our much larger dataset?
[37:00.870 --> 37:01.950]  Well, a similar story.
[37:01.950 --> 37:03.610]  These attacks are very quick
[37:03.610 --> 37:05.630]  when doing only a couple of words,
[37:05.630 --> 37:07.630]  but admittedly don't pull in too much.
[37:07.830 --> 37:09.870]  Only tens of thousands, you know.
[37:10.410 --> 37:11.830]  The other factor likely
[37:11.830 --> 37:13.830]  at play here is there's just
[37:13.830 --> 37:15.530]  not that many passphrases that are also
[37:15.530 --> 37:17.650]  phrases that exist in the dataset with
[37:17.650 --> 37:18.890]  four words.
[37:19.290 --> 37:21.530]  While we're here, I would like to point out the accuracy tradeoff
[37:21.530 --> 37:23.690]  of choosing a 3-gram over 2-gram
[37:23.690 --> 37:24.470]  model.
[37:27.310 --> 37:29.610]  You can see that the 3-gram got a little over
[37:29.610 --> 37:31.530]  half as much, but it only took
[37:31.530 --> 37:33.310]  one-twelfth the time.
[37:33.570 --> 37:35.390]  That is the accuracy tradeoff.
[37:35.470 --> 37:37.450]  Now, before people get too uninterested, I should
[37:37.450 --> 37:39.330]  show off the really impressive thing for me,
[37:39.330 --> 37:41.470]  which is what happens when you put these to use
[37:41.470 --> 37:43.250]  on even longer passphrases.
[37:43.850 --> 37:45.550]  There still shouldn't be much out
[37:45.550 --> 37:47.390]  here in the search space, but people seem
[37:47.390 --> 37:49.770]  content to use phrases at this length.
[37:49.770 --> 37:51.350]  Possibly because these should be safe,
[37:51.350 --> 37:53.450]  according to everybody's advice. And these aren't
[37:53.450 --> 37:55.210]  even common phrases, they're just kind of
[37:55.210 --> 37:57.390]  normal phrases. Here are some
[37:57.390 --> 37:59.210]  examples of what isn't safe anymore.
[37:59.470 --> 38:01.690]  These probably aren't the passphrases
[38:01.690 --> 38:03.430]  you expected to leave here knowing are
[38:03.430 --> 38:05.550]  crackable, and certainly not crackable in 1, 2,
[38:05.550 --> 38:07.830]  or 9 hours at speeds barely over 1 gigahash
[38:07.950 --> 38:09.830]  a second. Do you have nightmares yet?
[38:09.830 --> 38:11.550]  Because everything here was generated
[38:11.550 --> 38:13.210]  by a Markov model and a simple
[38:13.210 --> 38:15.170]  hashcat rule set. No dumped passwords
[38:15.170 --> 38:17.330]  as input, no cherry picking shenanigans,
[38:17.330 --> 38:19.070]  and the training data was completely agnostic
[38:19.070 --> 38:21.290]  to the target passphrases. No targeting
[38:21.290 --> 38:23.270]  of any kind. Admittedly,
[38:23.270 --> 38:25.350]  it does produce some strange candidates sometimes,
[38:25.350 --> 38:27.350]  and I've collected some of the weirder ones
[38:27.350 --> 38:28.650]  I've seen on the right here.
[38:29.070 --> 38:31.150]  I don't know how it went from intrauterine
[38:31.150 --> 38:33.170]  pressure catheters to Kiwi or Kiwis. I can
[38:33.170 --> 38:35.230]  only assume one of the websites scraped from the training data
[38:35.230 --> 38:37.390]  was a blog post about making Kiwi or Kiwis
[38:37.390 --> 38:39.490]  with an embedded catheter cowboy advertisement.
[38:39.490 --> 38:40.750]  I really don't know.
[38:40.750 --> 38:42.870]  Now I should head off some common questions
[38:42.870 --> 38:44.750]  I'm anticipating. Some of you
[38:44.750 --> 38:46.730]  familiar with hashcat might be thinking,
[38:46.730 --> 38:48.730]  um, doesn't hashcat already have
[38:48.730 --> 38:51.050]  Markov chains? Why can't we just do this?
[38:51.150 --> 38:52.730]  Well, it does. It has a pre-made model
[38:52.730 --> 38:54.750]  for character-level predictions,
[38:54.750 --> 38:56.690]  which, when you start going out to 15-character
[38:56.690 --> 38:58.650]  passwords and what you want is multiple actual
[38:58.650 --> 39:01.010]  words, that model falls apart pretty quickly.
[39:01.010 --> 39:02.670]  Not to mention it doesn't store relationships
[39:02.670 --> 39:04.670]  between words, and it will basically
[39:04.670 --> 39:06.890]  never finish, because the search space is at least
[39:07.550 --> 39:09.250]  333 sextillion combinations.
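[Editor's sketch] The talk does not say how the 333 sextillion figure was derived. One combination that happens to reproduce it is a 37-symbol alphabet at length 15 (for instance 26 letters, 10 digits, and a space); that alphabet is an assumption, not something stated by the speaker. The 1e9 guesses per second below is also a hypothetical round number.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def keyspace(alphabet_size, length):
    """Number of distinct strings of exactly `length` symbols."""
    return alphabet_size ** length

# Assumed alphabet: 37 symbols, password length 15.
total = keyspace(37, 15)
print(f"{total:.3e} candidates")

# Hypothetical rate of 1e9 guesses per second.
years = total / 1e9 / SECONDS_PER_YEAR
print(f"about {years:,.0f} years to exhaust")
```

Even at a billion guesses per second, exhausting a character-level space of that size is out of reach, which is the point being made about never finishing.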
[39:09.250 --> 39:11.290]  If you're thinking, haven't you tried
[39:11.290 --> 39:13.190]  this with GPT-2 or some other
[39:13.190 --> 39:14.490]  machine learning algorithm?
[39:15.470 --> 39:17.150]  My answer is, have you considered
[39:17.150 --> 39:19.310]  hash rates? Markov models are good for this
[39:19.310 --> 39:21.370]  because they're quick, especially compared to massive
[39:21.370 --> 39:23.250]  models like GPT-2 or
[39:23.250 --> 39:25.370]  some other ML model, which by itself
[39:25.370 --> 39:27.190]  is more than 5 gigabytes and tends
[39:27.190 --> 39:29.490]  to take multiple seconds to pop out a
[39:29.490 --> 39:31.510]  single prediction. We need millions of predictions
[39:32.790 --> 39:33.350]  in a second for it
[39:33.350 --> 39:35.250]  to be viable in password cracking.
[39:35.250 --> 39:37.270]  The other side of this is, the more advanced
[39:37.270 --> 39:39.230]  models focus on making paragraphs, or
[39:39.230 --> 39:41.150]  complete pages. We barely need
[39:41.150 --> 39:42.950]  sentences. And there are very few passwords
[39:42.950 --> 39:46.770]  I've seen that resemble more than a
[39:46.770 --> 39:48.470]  sentence fragment.
[39:48.470 --> 39:49.390]  Alright, heading back
[39:49.390 --> 39:51.750]  out of the weeds here, it's time to wrap up and summarize.
[39:51.750 --> 39:53.610]  If you're not an attacker, you've probably
[39:53.610 --> 39:55.710]  been sitting through this talk, mildly
[39:55.710 --> 39:57.530]  terrified, waiting for some kind of
[39:57.530 --> 39:59.510]  recommendation. Allow me
[39:59.510 --> 40:01.370]  to make you wait slightly longer by addressing
[40:01.370 --> 40:04.250]  the red teamers first. Red teams,
[40:04.250 --> 40:05.370]  please, don't
[40:05.370 --> 40:07.230]  cop out. 15-character passwords
[40:07.230 --> 40:09.730]  are hard, but they aren't that hard to crack.
[40:10.130 --> 40:11.430]  Even against bcrypt, you should
[40:11.430 --> 40:13.370]  at least be doing combinators, PRINCE attacks,
[40:13.370 --> 40:15.750]  and Markov chains for the duration of your engagement.
[40:15.910 --> 40:17.510]  And don't pretend your results should
[40:17.510 --> 40:19.170]  approach zero, it should never
[40:19.170 --> 40:21.470]  be zero. There will always
[40:21.470 --> 40:22.910]  be somebody who picks something like
[40:23.610 --> 40:25.670]  HelloSpring2020 as their passphrase.
[40:25.930 --> 40:27.190]  Blue team, your turn
[40:27.190 --> 40:29.330]  finally. Do the math that applies to
[40:29.330 --> 40:31.250]  your situation, and do a little
[40:31.250 --> 40:32.950]  more than just gather people that
[40:32.950 --> 40:34.930]  scored highly on the SAT more
[40:34.930 --> 40:36.590]  than a decade ago, and
[40:36.590 --> 40:39.230]  pack them into a conference room and whiteboard it.
[40:39.270 --> 40:40.870]  If intercepted or dumped hashes
[40:40.870 --> 40:43.610]  or personal password reuse is a
[40:43.610 --> 40:45.630]  concern for you, and it almost always should be,
[40:45.630 --> 40:47.330]  feel free to take my math and re-present it.
[40:47.330 --> 40:49.170]  Unless your organization is cracking
[40:49.170 --> 40:51.090]  its own passwords, you will not know what people
[40:51.090 --> 40:53.450]  are using. I recommend the best
[40:53.450 --> 40:55.450]  behaviors, but plan for the absolute worst.
[40:55.450 --> 40:57.250]  Which brings me to something that
[40:57.250 --> 40:59.170]  can serve as a starting point for
[40:59.170 --> 41:00.770]  what to pass along to end users.
[41:00.770 --> 41:03.090]  And this is my personal recommendation,
[41:03.090 --> 41:04.710]  which of course is fairly paranoid, but the
[41:04.710 --> 41:06.770]  intent here is to make sure
[41:06.970 --> 41:08.770]  a hash, regardless of what type
[41:08.770 --> 41:10.770]  of hash it is, can't be cracked
[41:10.770 --> 41:13.070]  in $500 of expense.
[41:13.290 --> 41:14.450]  Which means you're going to be making
[41:14.450 --> 41:16.490]  illogical or highly unlikely phrases
[41:16.490 --> 41:18.750]  of five or more relatively uncommon words
[41:18.750 --> 41:20.710]  that are thematically unrelated.
[41:21.350 --> 41:22.070]  Whew.
[41:22.610 --> 41:24.690]  That seems like a lot, but we're talking about
[41:24.690 --> 41:26.690]  diabetic unicorn trap Chicago tunnel
[41:26.690 --> 41:28.210]  as a passphrase.
[41:28.210 --> 41:30.430]  It takes a little
[41:30.430 --> 41:32.270]  bit to make, but these
[41:32.270 --> 41:34.330]  things should be able to hang around for a while.
[41:34.330 --> 41:35.770]  And unless you force an attacker to use
[41:36.330 --> 41:38.170]  a large dictionary combining multiple types
[41:38.170 --> 41:40.270]  of words (cities, structures, fantasy creatures, verbs,
[41:40.270 --> 41:42.290]  medical conditions) and use some words
[41:42.290 --> 41:44.110]  that are uncommon, you're going to be building your
[41:44.110 --> 41:46.390]  security posture with cracks at the foundation.
[41:46.390 --> 41:47.870]  There will be some
[41:47.870 --> 41:50.090]  edge case that will allow them to crack
[41:50.090 --> 41:51.010]  your passwords.
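[Editor's sketch] One way to get five or more unrelated words is to let a cryptographic random source pick them, so no theme or grammar links the words. This is a minimal sketch under stated assumptions: `make_passphrase`, `entropy_bits`, and the tiny `WORDS` list are hypothetical, and a real wordlist would need thousands of entries spanning many categories, including uncommon words.

```python
import math
import secrets

def make_passphrase(wordlist, count=5):
    """Pick `count` words independently and uniformly at random."""
    return " ".join(secrets.choice(wordlist) for _ in range(count))

def entropy_bits(wordlist_size, count):
    """Bits of entropy when each word is drawn uniformly from the list."""
    return count * math.log2(wordlist_size)

# Hypothetical, far-too-small wordlist, for illustration only.
WORDS = ["diabetic", "unicorn", "trap", "chicago", "tunnel",
         "granite", "oboe", "walrus", "ferment", "quarry"]

print(make_passphrase(WORDS, 5))
print(f"{entropy_bits(len(WORDS), 5):.1f} bits with this toy list")
print(f"{entropy_bits(20_000, 5):.1f} bits with a 20,000-word list")
```

Random selection sidesteps the weakness the talk exploits: a Markov model trained on natural language has no advantage when consecutive words have no linguistic relationship.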
[41:52.370 --> 41:54.230]  So here's a very
[41:54.230 --> 41:56.090]  overfull slide of everything.
[41:56.090 --> 41:58.230]  Any questions? I am in the
[41:58.230 --> 42:00.230]  Discord channel waiting by this
[42:00.230 --> 42:01.250]  time.
[42:02.870 --> 42:04.190]  Rephraser is up there at the
[42:04.190 --> 42:06.210]  top, though it's probably just easier to
[42:06.210 --> 42:07.990]  look for GitHub travco
[42:07.990 --> 42:09.950]  rephraser in Google.
[42:10.810 --> 42:12.050]  And, yeah.
[42:13.010 --> 42:14.350]  Feedback is appreciated.
